Tempered Markov Chain Monte Carlo for training of Restricted Boltzmann Machines
Abstract
Alternating Gibbs sampling is the most common scheme used for sampling from Restricted Boltzmann Machines (RBM), a crucial component in deep architectures such as Deep Belief Networks. However, we find that it often does a very poor job of rendering the diversity of modes captured by the trained model. We suspect that this limits the benefit that training algorithms relying on Gibbs sampling to uncover spurious modes, such as the Persistent Contrastive Divergence algorithm, could in principle provide. To alleviate this problem, we explore the use of tempered Markov Chain Monte Carlo for sampling in RBMs. Both visualization of samples and measures of likelihood indicate that it helps sampling as well as learning.
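To make the idea concrete, here is a minimal sketch of tempered sampling for a tiny binary RBM. It assumes the parallel tempering (replica exchange) variant, since the abstract says only "tempered MCMC" and does not fix the scheme. The function names, the temperature ladder `betas`, and the toy parameters are all hypothetical and chosen for illustration. Each chain runs alternating Gibbs sampling on the tempered distribution p_beta(v, h) proportional to exp(-beta * E(v, h)), and neighbouring chains exchange states with the usual Metropolis acceptance rule.

```python
import math
import random


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def energy(v, h, W, b, c):
    """RBM energy E(v, h) = -b.v - c.h - v^T W h for binary v and h."""
    e = -sum(bi * vi for bi, vi in zip(b, v))
    e -= sum(cj * hj for cj, hj in zip(c, h))
    e -= sum(v[i] * W[i][j] * h[j] for i in range(len(v)) for j in range(len(h)))
    return e


def gibbs_step(v, W, b, c, beta, rng):
    """One alternating Gibbs sweep (h given v, then v given h) at inverse temperature beta."""
    n_v, n_h = len(b), len(c)
    h = [int(rng.random() < sigmoid(beta * (c[j] + sum(v[i] * W[i][j] for i in range(n_v)))))
         for j in range(n_h)]
    v = [int(rng.random() < sigmoid(beta * (b[i] + sum(W[i][j] * h[j] for j in range(n_h)))))
         for i in range(n_v)]
    return v, h


def parallel_tempering(W, b, c, betas, n_steps, seed=0):
    """Return n_steps visible samples from the chain at betas[0] (assumed to be 1.0)."""
    rng = random.Random(seed)
    n_v, n_h = len(b), len(c)
    # One (v, h) state per temperature; h is resampled on the first sweep.
    states = [([rng.randint(0, 1) for _ in range(n_v)], [0] * n_h) for _ in betas]
    samples = []
    for _ in range(n_steps):
        # Gibbs sweep in every chain at its own temperature.
        states = [gibbs_step(v, W, b, c, beta, rng)
                  for (v, _), beta in zip(states, betas)]
        # Propose swaps between neighbouring temperatures.
        for k in range(len(betas) - 1):
            e_k = energy(*states[k], W, b, c)
            e_next = energy(*states[k + 1], W, b, c)
            log_ratio = (betas[k] - betas[k + 1]) * (e_k - e_next)
            if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
                states[k], states[k + 1] = states[k + 1], states[k]
        samples.append(list(states[0][0]))
    return samples


# Toy usage: 3 visible units, 2 hidden units, 4 temperatures.
W = [[2.0, -2.0], [2.0, -2.0], [-2.0, 2.0]]
b = [0.0, 0.0, 0.0]
c = [0.0, 0.0]
samples = parallel_tempering(W, b, c, betas=[1.0, 0.7, 0.4, 0.1], n_steps=50)
```

The chains at small `beta` see a flattened energy landscape and move between modes easily; the swaps let those moves propagate to the `beta = 1` chain, whose samples are the ones kept. Only that chain targets the model distribution, so the hotter chains are pure overhead in exchange for better mixing.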
Similar resources
Parallel Tempering for Training of Restricted Boltzmann Machines
Alternating Gibbs sampling between visible and latent units is the most common scheme used for sampling from Restricted Boltzmann Machines (RBM), a crucial component in deep architectures such as Deep Belief Networks (DBN). However, we find that it often does a very poor job of rendering the diversity of modes captured by the trained model. We suspect that this property hinders RBM training met...
From Monte Carlo to Las Vegas: Improving Restricted Boltzmann Machine Training Through Stopping Sets
We propose a Las Vegas transformation of Markov Chain Monte Carlo (MCMC) estimators of Restricted Boltzmann Machines (RBMs). We denote our approach Markov Chain Las Vegas (MCLV). MCLV gives statistical guarantees in exchange for random running times. MCLV uses a stopping set built from the training data and has a maximum number of Markov chain steps K (referred to as MCLV-K). We present a MCLV-K gra...
A bound for the convergence rate of parallel tempering for sampling restricted Boltzmann machines
Sampling from restricted Boltzmann machines (RBMs) is done by Markov chain Monte Carlo (MCMC) methods. The faster the convergence of the Markov chain, the more efficiently can high quality samples be obtained. This is also important for robust training of RBMs, which usually relies on sampling. Parallel tempering (PT), an MCMC method that maintains several replicas of the original chain at higher...
Training restricted Boltzmann machines: An introduction
Restricted Boltzmann machines (RBMs) are probabilistic graphical models that can be interpreted as stochastic neural networks. They have attracted much attention as building blocks for the multi-layer learning systems called deep belief networks, and variants and extensions of RBMs have found application in a wide range of pattern recognition tasks. This tutorial introduces RBMs from the viewpo...
Learning and Evaluating Boltzmann Machines
We provide a brief overview of the variational framework for obtaining deterministic approximations or upper bounds for the log-partition function. We also review some of the Monte Carlo based methods for estimating partition functions of arbitrary Markov Random Fields. We then develop an annealed importance sampling (AIS) procedure for estimating partition functions of restricted Boltzmann mac...